A general algorithm for word graph matrix decomposition
Abstract
In automatic speech recognition, word graphs (lattices) are commonly used as an approximate representation of the complete word search space. Usually these word lattices are acyclic and have no a-priori structure. More recently, a new class of normalized word lattices has been proposed. These word lattices (a.k.a. sausages) are very space-efficient, and they provide a normalization (chunking) of the lattice by aligning words from all possible hypotheses. In this paper we propose a general framework for lattice chunking, the pivot algorithm. The pivot algorithm has four important properties. First, time information is not necessary but is beneficial for overall performance. Second, the algorithm allows a predefined chunk structure to be specified for the final word lattice. Third, the algorithm operates on both weighted and unweighted lattices. Fourth, the labels on the graph are generic and could be words as well as part-of-speech tags or parse tags. While the algorithm has applications to many tasks (e.g. parsing, named entity extraction), we present results on the performance of confidence scores for different large-vocabulary speech recognition tasks. We compare the results of our algorithms against off-the-shelf methods and show significant improvements.
Similar resources
Graph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members
Graphs have many applications in real-world problems. When we deal with a huge volume of data, analyzing the data is difficult or sometimes impossible. In big data problems, clustering is a useful tool for data analysis. Singular value decomposition (SVD) is one of the best algorithms for clustering graphs, but we do not have any choice to select the number of clusters and the number of members ...
A Parallel Multistage ILU Factorization Based on a Hierarchical Graph Decomposition
PHIDAL (Parallel Hierarchical Interface Decomposition ALgorithm) is a parallel incomplete factorization method which exploits a hierarchical interface decomposition of the adjacency graph of the coefficient matrix. The idea of the decomposition is similar to that of the well-known wirebasket techniques used in domain decomposition. However, the method is devised for general, irregularly structu...
A practical algorithm for [r, s, t]-coloring of graph
Coloring graphs is an important and frequently used topic in diverse sciences. In the majority of articles, the aim is to find a proper bound for vertex coloring, edge coloring or total coloring of the graph. Although it is important to find a proper algorithm for graph coloring, it is also hard and time-consuming. In this paper, a new algorithm for vertex coloring, edge coloring an...
Distinct edge geodetic decomposition in graphs
Let G = (V, E) be a simple connected graph of order p and size q. A decomposition of a graph G is a collection π of edge-disjoint subgraphs G_1, G_2, …, G_n of G such that every edge of G belongs to exactly one G_i (1 ≤ i ≤ n). The decomposition π = {G_1, G_2, …, G_n} of a connected graph G is said to be a distinct edge geodetic decomposition if g_1(G_i) ≠ g_1(G_j) (1 ≤ i ≠ j ≤ n). The maximum cardinality of π...
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...
Publication date: 2003